Overview

Dataset statistics

Number of variables: 16
Number of observations: 29531
Missing cells: 88488
Missing cells (%): 18.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.6 MiB
Average record size in memory: 128.0 B
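
The derived figures in this block follow from the raw counts. The sketch below recomputes them as a sanity check. The 8 bytes per cell is an assumption (one float64 or one object pointer per value on a 64-bit build), not something the report states.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 16
n_observations = 29531
missing_cells = 88488

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: each cell occupies 8 bytes in memory.
bytes_per_record = n_variables * 8
total_mib = n_observations * bytes_per_record / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"record size: {bytes_per_record} B, total: {total_mib:.1f} MiB")
```

Under that assumption the results agree with the reported 18.7%, 128.0 B and 3.6 MiB.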

Variable types

Categorical: 3
Numeric: 13

Alerts

Date has a high cardinality: 2009 distinct values [High cardinality]
PM2.5 is highly correlated with PM10 and 1 other field [High correlation]
PM10 is highly correlated with PM2.5 and 4 other fields [High correlation]
NO is highly correlated with PM10 and 1 other field [High correlation]
NO2 is highly correlated with PM10 and 1 other field [High correlation]
NOx is highly correlated with PM10 and 2 other fields [High correlation]
CO is highly correlated with AQI [High correlation]
Benzene is highly correlated with Toluene and 1 other field [High correlation]
Toluene is highly correlated with Benzene and 1 other field [High correlation]
Xylene is highly correlated with Benzene and 1 other field [High correlation]
AQI is highly correlated with PM2.5 and 2 other fields [High correlation]
PM2.5 is highly correlated with PM10 and 1 other field [High correlation]
PM10 is highly correlated with PM2.5 and 3 other fields [High correlation]
NO is highly correlated with PM10 and 1 other field [High correlation]
NO2 is highly correlated with NOx and 1 other field [High correlation]
NOx is highly correlated with PM10 and 2 other fields [High correlation]
CO is highly correlated with AQI [High correlation]
Benzene is highly correlated with Toluene [High correlation]
Toluene is highly correlated with Benzene [High correlation]
AQI is highly correlated with PM2.5 and 3 other fields [High correlation]
PM2.5 is highly correlated with PM10 and 1 other field [High correlation]
PM10 is highly correlated with PM2.5 and 1 other field [High correlation]
NO is highly correlated with NOx [High correlation]
NOx is highly correlated with NO [High correlation]
Benzene is highly correlated with Toluene and 1 other field [High correlation]
Toluene is highly correlated with Benzene [High correlation]
Xylene is highly correlated with Benzene [High correlation]
AQI is highly correlated with PM2.5 and 1 other field [High correlation]
City is highly correlated with PM10 and 5 other fields [High correlation]
PM2.5 is highly correlated with PM10 and 2 other fields [High correlation]
PM10 is highly correlated with City and 4 other fields [High correlation]
NO is highly correlated with PM10 and 1 other field [High correlation]
NO2 is highly correlated with AQI [High correlation]
NOx is highly correlated with NO [High correlation]
NH3 is highly correlated with City [High correlation]
CO is highly correlated with City and 2 other fields [High correlation]
SO2 is highly correlated with City and 2 other fields [High correlation]
Benzene is highly correlated with Toluene [High correlation]
Toluene is highly correlated with Benzene [High correlation]
AQI is highly correlated with City and 6 other fields [High correlation]
AQI_Bucket is highly correlated with City and 3 other fields [High correlation]
PM2.5 has 4598 (15.6%) missing values [Missing]
PM10 has 11140 (37.7%) missing values [Missing]
NO has 3582 (12.1%) missing values [Missing]
NO2 has 3585 (12.1%) missing values [Missing]
NOx has 4185 (14.2%) missing values [Missing]
NH3 has 10328 (35.0%) missing values [Missing]
CO has 2059 (7.0%) missing values [Missing]
SO2 has 3854 (13.1%) missing values [Missing]
O3 has 4022 (13.6%) missing values [Missing]
Benzene has 5623 (19.0%) missing values [Missing]
Toluene has 8041 (27.2%) missing values [Missing]
Xylene has 18109 (61.3%) missing values [Missing]
AQI has 4681 (15.9%) missing values [Missing]
AQI_Bucket has 4681 (15.9%) missing values [Missing]
Benzene is highly skewed (γ1 = 21.30421849) [Skewed]
NOx has 740 (2.5%) zeros [Zeros]
CO has 2328 (7.9%) zeros [Zeros]
Benzene has 3802 (12.9%) zeros [Zeros]
Toluene has 2861 (9.7%) zeros [Zeros]
Xylene has 1747 (5.9%) zeros [Zeros]
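
The skewness alert uses γ1, the conventional symbol for the third standardized moment. The sketch below computes the plain moment-based form on two toy samples. The report does not say which estimator produced the Benzene value. pandas' `Series.skew`, for instance, applies a small-sample bias correction, so it can differ from this sketch on short series.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Toy samples (illustrative only, not taken from the dataset).
symmetric = [1.0, 2.0, 3.0]
right_tailed = [1.0, 2.0, 3.0, 4.0, 10.0]

print(skewness(symmetric))
print(skewness(right_tailed))
```

A symmetric sample gives zero and a long right tail gives a positive value. A value above 21, as reported for Benzene, points to a few very large readings dominating the distribution.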

Reproduction

Analysis started: 2021-12-26 14:32:28.502799
Analysis finished: 2021-12-26 14:32:49.630384
Duration: 21.13 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json
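
A report of this kind can be regenerated with pandas-profiling's `ProfileReport` class. This is a minimal sketch, not the configuration actually used here. The function name `build_report`, the CSV path and the output file name are hypothetical, and the settings stored in config.json are not reproduced.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write an HTML report (sketch)."""
    # Imports are local so the function can be defined even where the
    # third-party packages are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # csv_path is a hypothetical input file
    profile = ProfileReport(df, title="Pandas Profiling Report")
    profile.to_file(out_path)
    return out_path
```

Calling `build_report("air_quality.csv")` (a placeholder file name) would write `report.html` to the working directory.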

Variables

City
Categorical

HIGH CORRELATION

Distinct: 26
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 230.8 KiB

Value | Count
Mumbai | 2009
Delhi | 2009
Chennai | 2009
Bengaluru | 2009
Lucknow | 2009
Other values (21) | 19486

Length

Max length: 18
Median length: 8
Mean length: 8.275744133
Min length: 5

Characters and Unicode

Total characters: 244391
Distinct characters: 38
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Ahmedabad
2nd row: Ahmedabad
3rd row: Ahmedabad
4th row: Ahmedabad
5th row: Ahmedabad

Common Values

Value | Count | Frequency (%)
Mumbai | 2009 | 6.8%
Delhi | 2009 | 6.8%
Chennai | 2009 | 6.8%
Bengaluru | 2009 | 6.8%
Lucknow | 2009 | 6.8%
Ahmedabad | 2009 | 6.8%
Hyderabad | 2006 | 6.8%
Patna | 1858 | 6.3%
Gurugram | 1679 | 5.7%
Visakhapatnam | 1462 | 5.0%
Other values (16) | 10472 | 35.5%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
mumbai | 2009 | 6.8%
delhi | 2009 | 6.8%
chennai | 2009 | 6.8%
bengaluru | 2009 | 6.8%
lucknow | 2009 | 6.8%
ahmedabad | 2009 | 6.8%
hyderabad | 2006 | 6.8%
patna | 1858 | 6.3%
gurugram | 1679 | 5.7%
visakhapatnam | 1462 | 5.0%
Other values (16) | 10472 | 35.5%

Most occurring characters

Value | Count | Frequency (%)
a | 46303 | 18.9%
r | 21033 | 8.6%
u | 15396 | 6.3%
n | 15294 | 6.3%
h | 13678 | 5.6%
i | 13664 | 5.6%
e | 11353 | 4.6%
m | 10991 | 4.5%
d | 8334 | 3.4%
t | 8306 | 3.4%
Other values (28) | 80039 | 32.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 214860 | 87.9%
Uppercase Letter | 29531 | 12.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 46303 | 21.6%
r | 21033 | 9.8%
u | 15396 | 7.2%
n | 15294 | 7.1%
h | 13678 | 6.4%
i | 13664 | 6.4%
e | 11353 | 5.3%
m | 10991 | 5.1%
d | 8334 | 3.9%
t | 8306 | 3.9%
Other values (13) | 50508 | 23.5%
Uppercase Letter
Value | Count | Frequency (%)
A | 4294 | 14.5%
B | 3236 | 11.0%
C | 2699 | 9.1%
J | 2283 | 7.7%
G | 2181 | 7.4%
T | 2037 | 6.9%
M | 2009 | 6.8%
L | 2009 | 6.8%
D | 2009 | 6.8%
H | 2006 | 6.8%
Other values (5) | 4768 | 16.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 244391 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 46303 | 18.9%
r | 21033 | 8.6%
u | 15396 | 6.3%
n | 15294 | 6.3%
h | 13678 | 5.6%
i | 13664 | 5.6%
e | 11353 | 4.6%
m | 10991 | 4.5%
d | 8334 | 3.4%
t | 8306 | 3.4%
Other values (28) | 80039 | 32.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 244391 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 46303 | 18.9%
r | 21033 | 8.6%
u | 15396 | 6.3%
n | 15294 | 6.3%
h | 13678 | 5.6%
i | 13664 | 5.6%
e | 11353 | 4.6%
m | 10991 | 4.5%
d | 8334 | 3.4%
t | 8306 | 3.4%
Other values (28) | 80039 | 32.8%

Date
Categorical

HIGH CARDINALITY

Distinct: 2009
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Memory size: 230.8 KiB

Value | Count
2020-05-13 | 26
2020-06-18 | 26
2020-06-21 | 26
2020-05-22 | 26
2020-04-14 | 26
Other values (2004) | 29401

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 295310
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015-01-01
2nd row: 2015-01-02
3rd row: 2015-01-03
4th row: 2015-01-04
5th row: 2015-01-05

Common Values

Value | Count | Frequency (%)
2020-05-13 | 26 | 0.1%
2020-06-18 | 26 | 0.1%
2020-06-21 | 26 | 0.1%
2020-05-22 | 26 | 0.1%
2020-04-14 | 26 | 0.1%
2020-05-27 | 26 | 0.1%
2020-06-13 | 26 | 0.1%
2020-05-16 | 26 | 0.1%
2020-05-25 | 26 | 0.1%
2020-03-28 | 26 | 0.1%
Other values (1999) | 29271 | 99.1%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2020-05-13 | 26 | 0.1%
2020-06-26 | 26 | 0.1%
2020-03-11 | 26 | 0.1%
2020-06-11 | 26 | 0.1%
2020-03-19 | 26 | 0.1%
2020-04-09 | 26 | 0.1%
2020-04-06 | 26 | 0.1%
2020-04-25 | 26 | 0.1%
2020-05-09 | 26 | 0.1%
2020-04-01 | 26 | 0.1%
Other values (1999) | 29271 | 99.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 70666 | 23.9%
- | 59062 | 20.0%
2 | 51607 | 17.5%
1 | 49700 | 16.8%
9 | 12479 | 4.2%
8 | 11560 | 3.9%
7 | 9799 | 3.3%
6 | 9198 | 3.1%
5 | 8529 | 2.9%
3 | 7101 | 2.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 236248 | 80.0%
Dash Punctuation | 59062 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 70666 | 29.9%
2 | 51607 | 21.8%
1 | 49700 | 21.0%
9 | 12479 | 5.3%
8 | 11560 | 4.9%
7 | 9799 | 4.1%
6 | 9198 | 3.9%
5 | 8529 | 3.6%
3 | 7101 | 3.0%
4 | 5609 | 2.4%
Dash Punctuation
Value | Count | Frequency (%)
- | 59062 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 295310 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 70666 | 23.9%
- | 59062 | 20.0%
2 | 51607 | 17.5%
1 | 49700 | 16.8%
9 | 12479 | 4.2%
8 | 11560 | 3.9%
7 | 9799 | 3.3%
6 | 9198 | 3.1%
5 | 8529 | 2.9%
3 | 7101 | 2.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 295310 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 70666 | 23.9%
- | 59062 | 20.0%
2 | 51607 | 17.5%
1 | 49700 | 16.8%
9 | 12479 | 4.2%
8 | 11560 | 3.9%
7 | 9799 | 3.3%
6 | 9198 | 3.1%
5 | 8529 | 2.9%
3 | 7101 | 2.4%

PM2.5
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 11716
Distinct (%): 47.0%
Missing: 4598
Missing (%): 15.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.45057795
Minimum: 0.04
Maximum: 949.99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0.04
5-th percentile: 13.206
Q1: 28.82
median: 48.57
Q3: 80.59
95-th percentile: 193.96
Maximum: 949.99
Range: 949.95
Interquartile range (IQR): 51.77

Descriptive statistics

Standard deviation: 64.66144946
Coefficient of variation (CV): 0.9586493018
Kurtosis: 21.13222159
Mean: 67.45057795
Median Absolute Deviation (MAD): 23.43
Skewness: 3.369959851
Sum: 1681745.26
Variance: 4181.103046
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
11 | 19 | 0.1%
20.75 | 12 | < 0.1%
27.82 | 11 | < 0.1%
15 | 10 | < 0.1%
11.81 | 10 | < 0.1%
28.45 | 10 | < 0.1%
29.75 | 10 | < 0.1%
47.43 | 10 | < 0.1%
18.81 | 10 | < 0.1%
18.36 | 9 | < 0.1%
Other values (11706) | 24822 | 84.1%
(Missing) | 4598 | 15.6%

Smallest values
Value | Count | Frequency (%)
0.04 | 1 | < 0.1%
0.16 | 1 | < 0.1%
0.24 | 1 | < 0.1%
0.28 | 1 | < 0.1%
0.98 | 1 | < 0.1%
0.99 | 1 | < 0.1%
1.14 | 1 | < 0.1%
1.19 | 1 | < 0.1%
1.25 | 1 | < 0.1%
1.39 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
949.99 | 1 | < 0.1%
917.77 | 1 | < 0.1%
916.67 | 1 | < 0.1%
914.94 | 1 | < 0.1%
914.64 | 1 | < 0.1%
894.75 | 1 | < 0.1%
868.66 | 1 | < 0.1%
858.73 | 1 | < 0.1%
832.8 | 1 | < 0.1%
821.42 | 1 | < 0.1%
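
Several of the PM2.5 statistics are simple functions of others, which makes them easy to cross-check. The sketch below recomputes range, IQR, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative.

```python
# Values copied from the PM2.5 summary.
minimum, maximum = 0.04, 949.99
q1, q3 = 28.82, 80.59
mean = 67.45057795
std = 64.66144946

value_range = maximum - minimum  # reported Range: 949.95
iqr = q3 - q1                    # reported IQR: 51.77
cv = std / mean                  # reported CV: 0.9586493018
variance = std ** 2              # reported Variance: 4181.103046

print(f"range={value_range:.2f} iqr={iqr:.2f} cv={cv:.6f} var={variance:.3f}")
```

The same four identities hold for every numeric variable in the report, so the check can be reused by swapping in another variable's figures.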

PM10
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 12571
Distinct (%): 68.4%
Missing: 11140
Missing (%): 37.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 118.1271029
Minimum: 0.01
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 26.365
Q1: 56.255
median: 95.68
Q3: 149.745
95-th percentile: 303.34
Maximum: 1000
Range: 999.99
Interquartile range (IQR): 93.49

Descriptive statistics

Standard deviation: 90.60510972
Coefficient of variation (CV): 0.767013729
Kurtosis: 6.747873494
Mean: 118.1271029
Median Absolute Deviation (MAD): 43.92
Skewness: 2.0531891
Sum: 2172475.55
Variance: 8209.285907
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
94 | 9 | < 0.1%
33.81 | 7 | < 0.1%
87.02 | 6 | < 0.1%
39.46 | 6 | < 0.1%
102.17 | 6 | < 0.1%
109.67 | 6 | < 0.1%
72.04 | 6 | < 0.1%
20.53 | 6 | < 0.1%
43.1 | 6 | < 0.1%
84.08 | 6 | < 0.1%
Other values (12561) | 18327 | 62.1%
(Missing) | 11140 | 37.7%

Smallest values
Value | Count | Frequency (%)
0.01 | 1 | < 0.1%
0.02 | 1 | < 0.1%
0.03 | 1 | < 0.1%
0.04 | 2 | < 0.1%
0.06 | 1 | < 0.1%
0.07 | 1 | < 0.1%
0.13 | 2 | < 0.1%
0.14 | 2 | < 0.1%
0.16 | 1 | < 0.1%
0.17 | 2 | < 0.1%

Largest values
Value | Count | Frequency (%)
1000 | 1 | < 0.1%
985 | 2 | < 0.1%
917.08 | 1 | < 0.1%
847.41 | 1 | < 0.1%
802.87 | 1 | < 0.1%
796.88 | 1 | < 0.1%
768.16 | 1 | < 0.1%
763.58 | 1 | < 0.1%
761.91 | 1 | < 0.1%
743.98 | 1 | < 0.1%

NO
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 5776
Distinct (%): 22.3%
Missing: 3582
Missing (%): 12.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.57472966
Minimum: 0.02
Maximum: 390.68
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0.02
5-th percentile: 1.7
Q1: 5.63
median: 9.89
Q3: 19.95
95-th percentile: 61.19
Maximum: 390.68
Range: 390.66
Interquartile range (IQR): 14.32

Descriptive statistics

Standard deviation: 22.78584633
Coefficient of variation (CV): 1.296511911
Kurtosis: 25.16434683
Mean: 17.57472966
Median Absolute Deviation (MAD): 5.64
Skewness: 3.883166275
Sum: 456046.66
Variance: 519.1947932
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
5.93 | 34 | 0.1%
8.78 | 29 | 0.1%
7.78 | 29 | 0.1%
0.92 | 28 | 0.1%
1.94 | 27 | 0.1%
0.97 | 27 | 0.1%
0.9 | 26 | 0.1%
2.89 | 26 | 0.1%
7.97 | 26 | 0.1%
5.23 | 25 | 0.1%
Other values (5766) | 25672 | 86.9%
(Missing) | 3582 | 12.1%

Smallest values
Value | Count | Frequency (%)
0.02 | 7 | < 0.1%
0.03 | 3 | < 0.1%
0.06 | 2 | < 0.1%
0.09 | 2 | < 0.1%
0.1 | 1 | < 0.1%
0.11 | 2 | < 0.1%
0.12 | 1 | < 0.1%
0.13 | 1 | < 0.1%
0.14 | 1 | < 0.1%
0.18 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
390.68 | 1 | < 0.1%
382.44 | 1 | < 0.1%
351.3 | 1 | < 0.1%
304.26 | 1 | < 0.1%
289.75 | 1 | < 0.1%
288.55 | 1 | < 0.1%
287.14 | 1 | < 0.1%
273.39 | 1 | < 0.1%
270.09 | 1 | < 0.1%
268.03 | 1 | < 0.1%

NO2
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 7404
Distinct (%): 28.5%
Missing: 3585
Missing (%): 12.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.56065906
Minimum: 0.01
Maximum: 362.21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 4.93
Q1: 11.75
median: 21.69
Q3: 37.62
95-th percentile: 74.125
Maximum: 362.21
Range: 362.2
Interquartile range (IQR): 25.87

Descriptive statistics

Standard deviation: 24.4747458
Coefficient of variation (CV): 0.8569391114
Kurtosis: 11.211125
Mean: 28.56065906
Median Absolute Deviation (MAD): 11.42
Skewness: 2.46455959
Sum: 741034.86
Variance: 599.0131818
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
10.58 | 24 | 0.1%
9.42 | 23 | 0.1%
9.14 | 18 | 0.1%
9.47 | 17 | 0.1%
10.09 | 17 | 0.1%
9.24 | 17 | 0.1%
10.21 | 17 | 0.1%
7.14 | 17 | 0.1%
9.44 | 17 | 0.1%
13.9 | 16 | 0.1%
Other values (7394) | 25763 | 87.2%
(Missing) | 3585 | 12.1%

Smallest values
Value | Count | Frequency (%)
0.01 | 2 | < 0.1%
0.02 | 5 | < 0.1%
0.03 | 9 | < 0.1%
0.04 | 2 | < 0.1%
0.05 | 3 | < 0.1%
0.06 | 3 | < 0.1%
0.07 | 7 | < 0.1%
0.08 | 5 | < 0.1%
0.09 | 7 | < 0.1%
0.1 | 4 | < 0.1%

Largest values
Value | Count | Frequency (%)
362.21 | 1 | < 0.1%
292.02 | 1 | < 0.1%
277.31 | 1 | < 0.1%
273.39 | 1 | < 0.1%
266.46 | 1 | < 0.1%
245.62 | 1 | < 0.1%
241.34 | 1 | < 0.1%
239.18 | 1 | < 0.1%
239.1 | 1 | < 0.1%
237.27 | 1 | < 0.1%

NOx
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 8156
Distinct (%): 32.2%
Missing: 4185
Missing (%): 14.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.30912333
Minimum: 0
Maximum: 467.63
Zeros: 740
Zeros (%): 2.5%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.4
Q1: 12.82
median: 23.52
Q3: 40.1275
95-th percentile: 96.3575
Maximum: 467.63
Range: 467.63
Interquartile range (IQR): 27.3075

Descriptive statistics

Standard deviation: 31.64601094
Coefficient of variation (CV): 0.9794760016
Kurtosis: 10.83633513
Mean: 32.30912333
Median Absolute Deviation (MAD): 12.69
Skewness: 2.569914617
Sum: 818907.04
Variance: 1001.470008
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 740 | 2.5%
4.22 | 208 | 0.7%
6.24 | 115 | 0.4%
4.3 | 35 | 0.1%
2.21 | 31 | 0.1%
4.95 | 19 | 0.1%
4.14 | 18 | 0.1%
4.47 | 17 | 0.1%
4.97 | 16 | 0.1%
4.05 | 14 | < 0.1%
Other values (8146) | 24133 | 81.7%
(Missing) | 4185 | 14.2%

Smallest values
Value | Count | Frequency (%)
0 | 740 | 2.5%
0.03 | 4 | < 0.1%
0.04 | 9 | < 0.1%
0.05 | 3 | < 0.1%
0.06 | 2 | < 0.1%
0.07 | 2 | < 0.1%
0.09 | 1 | < 0.1%
0.1 | 3 | < 0.1%
0.11 | 2 | < 0.1%
0.12 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
467.63 | 1 | < 0.1%
382.84 | 1 | < 0.1%
378.31 | 1 | < 0.1%
378.24 | 1 | < 0.1%
302.78 | 1 | < 0.1%
293.1 | 1 | < 0.1%
289.09 | 1 | < 0.1%
287.89 | 1 | < 0.1%
273.33 | 1 | < 0.1%
271.94 | 1 | < 0.1%

NH3
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 5922
Distinct (%): 30.8%
Missing: 10328
Missing (%): 35.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.48347602
Minimum: 0.01
Maximum: 352.89
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 2.74
Q1: 8.58
median: 15.85
Q3: 30.02
95-th percentile: 63.427
Maximum: 352.89
Range: 352.88
Interquartile range (IQR): 21.44

Descriptive statistics

Standard deviation: 25.684275
Coefficient of variation (CV): 1.093716917
Kurtosis: 27.9646081
Mean: 23.48347602
Median Absolute Deviation (MAD): 9.25
Skewness: 4.083993436
Sum: 450953.19
Variance: 659.6819821
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
6.29 | 36 | 0.1%
6.32 | 29 | 0.1%
6.3 | 28 | 0.1%
6.31 | 28 | 0.1%
6.28 | 27 | 0.1%
6.27 | 24 | 0.1%
10.46 | 23 | 0.1%
6.59 | 22 | 0.1%
6.33 | 21 | 0.1%
6.6 | 21 | 0.1%
Other values (5912) | 18944 | 64.1%
(Missing) | 10328 | 35.0%

Smallest values
Value | Count | Frequency (%)
0.01 | 2 | < 0.1%
0.02 | 6 | < 0.1%
0.04 | 1 | < 0.1%
0.05 | 1 | < 0.1%
0.06 | 1 | < 0.1%
0.08 | 2 | < 0.1%
0.1 | 1 | < 0.1%
0.11 | 4 | < 0.1%
0.12 | 3 | < 0.1%
0.13 | 2 | < 0.1%

Largest values
Value | Count | Frequency (%)
352.89 | 1 | < 0.1%
328.89 | 1 | < 0.1%
323.48 | 1 | < 0.1%
309.04 | 1 | < 0.1%
303.53 | 1 | < 0.1%
302.08 | 1 | < 0.1%
301.28 | 1 | < 0.1%
301.18 | 1 | < 0.1%
297.64 | 1 | < 0.1%
296.43 | 1 | < 0.1%

CO
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 1779
Distinct (%): 6.5%
Missing: 2059
Missing (%): 7.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.248598209
Minimum: 0
Maximum: 175.81
Zeros: 2328
Zeros (%): 7.9%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.51
median: 0.89
Q3: 1.45
95-th percentile: 8.0245
Maximum: 175.81
Range: 175.81
Interquartile range (IQR): 0.94

Descriptive statistics

Standard deviation: 6.962884254
Coefficient of variation (CV): 3.096544428
Kurtosis: 109.4880503
Mean: 2.248598209
Median Absolute Deviation (MAD): 0.44
Skewness: 8.878321522
Sum: 61773.49
Variance: 48.48175714
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 2328 | 7.9%
0.68 | 209 | 0.7%
0.85 | 208 | 0.7%
0.8 | 205 | 0.7%
0.89 | 203 | 0.7%
0.84 | 200 | 0.7%
0.78 | 200 | 0.7%
0.81 | 199 | 0.7%
0.64 | 198 | 0.7%
0.67 | 194 | 0.7%
Other values (1769) | 23328 | 79.0%
(Missing) | 2059 | 7.0%

Smallest values
Value | Count | Frequency (%)
0 | 2328 | 7.9%
0.01 | 59 | 0.2%
0.02 | 59 | 0.2%
0.03 | 56 | 0.2%
0.04 | 30 | 0.1%
0.05 | 48 | 0.2%
0.06 | 42 | 0.1%
0.07 | 40 | 0.1%
0.08 | 34 | 0.1%
0.09 | 38 | 0.1%

Largest values
Value | Count | Frequency (%)
175.81 | 1 | < 0.1%
145.32 | 1 | < 0.1%
134.85 | 1 | < 0.1%
132.47 | 1 | < 0.1%
132.07 | 1 | < 0.1%
124.01 | 1 | < 0.1%
119.68 | 1 | < 0.1%
119.3 | 1 | < 0.1%
118.02 | 1 | < 0.1%
118 | 1 | < 0.1%

SO2
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 4761
Distinct (%): 18.5%
Missing: 3854
Missing (%): 13.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.53197726
Minimum: 0.01
Maximum: 193.86
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 2.63
Q1: 5.67
median: 9.16
Q3: 15.22
95-th percentile: 46.208
Maximum: 193.86
Range: 193.85
Interquartile range (IQR): 9.55

Descriptive statistics

Standard deviation: 18.13377485
Coefficient of variation (CV): 1.247853236
Kurtosis: 22.0671006
Mean: 14.53197726
Median Absolute Deviation (MAD): 4.12
Skewness: 4.083659555
Sum: 373137.58
Variance: 328.8337902
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
5.74 | 36 | 0.1%
6.12 | 35 | 0.1%
6.61 | 32 | 0.1%
5.81 | 32 | 0.1%
5.53 | 32 | 0.1%
4.65 | 32 | 0.1%
6.47 | 31 | 0.1%
5.57 | 31 | 0.1%
5.95 | 31 | 0.1%
5.13 | 30 | 0.1%
Other values (4751) | 25355 | 85.9%
(Missing) | 3854 | 13.1%

Smallest values
Value | Count | Frequency (%)
0.01 | 1 | < 0.1%
0.04 | 1 | < 0.1%
0.21 | 1 | < 0.1%
0.26 | 1 | < 0.1%
0.36 | 1 | < 0.1%
0.41 | 2 | < 0.1%
0.42 | 1 | < 0.1%
0.44 | 1 | < 0.1%
0.48 | 1 | < 0.1%
0.49 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
193.86 | 1 | < 0.1%
187.02 | 1 | < 0.1%
186.08 | 1 | < 0.1%
182.39 | 1 | < 0.1%
180.85 | 1 | < 0.1%
179.18 | 1 | < 0.1%
178.93 | 1 | < 0.1%
178.63 | 1 | < 0.1%
178.58 | 1 | < 0.1%
176.88 | 1 | < 0.1%

O3
Real number (ℝ≥0)

MISSING

Distinct: 7699
Distinct (%): 30.2%
Missing: 4022
Missing (%): 13.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.49143048
Minimum: 0.01
Maximum: 257.73
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 7.02
Q1: 18.86
median: 30.84
Q3: 45.57
95-th percentile: 74.142
Maximum: 257.73
Range: 257.72
Interquartile range (IQR): 26.71

Descriptive statistics

Standard deviation: 21.69492819
Coefficient of variation (CV): 0.6289947356
Kurtosis: 3.429464538
Mean: 34.49143048
Median Absolute Deviation (MAD): 12.96
Skewness: 1.330119322
Sum: 879841.9
Variance: 470.6699093
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
16.48 | 17 | 0.1%
22.14 | 15 | 0.1%
23.6 | 15 | 0.1%
19.64 | 14 | < 0.1%
18.33 | 14 | < 0.1%
22.94 | 13 | < 0.1%
13.14 | 13 | < 0.1%
32.06 | 13 | < 0.1%
19.68 | 13 | < 0.1%
25.3 | 12 | < 0.1%
Other values (7689) | 25370 | 85.9%
(Missing) | 4022 | 13.6%

Smallest values
Value | Count | Frequency (%)
0.01 | 4 | < 0.1%
0.02 | 7 | < 0.1%
0.03 | 2 | < 0.1%
0.04 | 3 | < 0.1%
0.05 | 2 | < 0.1%
0.06 | 3 | < 0.1%
0.07 | 1 | < 0.1%
0.1 | 8 | < 0.1%
0.11 | 2 | < 0.1%
0.12 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
257.73 | 1 | < 0.1%
200.41 | 1 | < 0.1%
193.31 | 1 | < 0.1%
186.07 | 1 | < 0.1%
177.07 | 1 | < 0.1%
175.04 | 1 | < 0.1%
172.28 | 1 | < 0.1%
169.36 | 1 | < 0.1%
169.35 | 1 | < 0.1%
165.48 | 1 | < 0.1%

Benzene
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 1873
Distinct (%): 7.8%
Missing: 5623
Missing (%): 19.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.280840305
Minimum: 0
Maximum: 455.03
Zeros: 3802
Zeros (%): 12.9%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.12
median: 1.07
Q3: 3.08
95-th percentile: 9.72
Maximum: 455.03
Range: 455.03
Interquartile range (IQR): 2.96

Descriptive statistics

Standard deviation: 15.81113642
Coefficient of variation (CV): 4.81923378
Kurtosis: 530.1714706
Mean: 3.280840305
Median Absolute Deviation (MAD): 1.06
Skewness: 21.30421849
Sum: 78438.33
Variance: 249.9920349
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 3802 | 12.9%
0.03 | 300 | 1.0%
0.02 | 292 | 1.0%
0.01 | 217 | 0.7%
0.04 | 190 | 0.6%
0.05 | 176 | 0.6%
0.09 | 170 | 0.6%
2 | 170 | 0.6%
0.1 | 167 | 0.6%
0.08 | 157 | 0.5%
Other values (1863) | 18267 | 61.9%
(Missing) | 5623 | 19.0%

Smallest values
Value | Count | Frequency (%)
0 | 3802 | 12.9%
0.01 | 217 | 0.7%
0.02 | 292 | 1.0%
0.03 | 300 | 1.0%
0.04 | 190 | 0.6%
0.05 | 176 | 0.6%
0.06 | 146 | 0.5%
0.07 | 123 | 0.4%
0.08 | 157 | 0.5%
0.09 | 170 | 0.6%

Largest values
Value | Count | Frequency (%)
455.03 | 1 | < 0.1%
454.85 | 1 | < 0.1%
449.38 | 1 | < 0.1%
448.59 | 1 | < 0.1%
445.83 | 1 | < 0.1%
443.63 | 1 | < 0.1%
438.01 | 1 | < 0.1%
435.9 | 1 | < 0.1%
435.09 | 1 | < 0.1%
432.94 | 1 | < 0.1%

Toluene
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 3608
Distinct (%): 16.8%
Missing: 8041
Missing (%): 27.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.70097208
Minimum: 0
Maximum: 454.85
Zeros: 2861
Zeros (%): 9.7%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.6
median: 2.97
Q3: 9.15
95-th percentile: 33.92
Maximum: 454.85
Range: 454.85
Interquartile range (IQR): 8.55

Descriptive statistics

Standard deviation: 19.96916366
Coefficient of variation (CV): 2.295049734
Kurtosis: 216.7455066
Mean: 8.70097208
Median Absolute Deviation (MAD): 2.94
Skewness: 11.66612883
Sum: 186983.89
Variance: 398.7674972
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 2861 | 9.7%
0.02 | 111 | 0.4%
0.03 | 102 | 0.3%
0.05 | 99 | 0.3%
0.04 | 86 | 0.3%
1.1 | 83 | 0.3%
6 | 79 | 0.3%
0.08 | 76 | 0.3%
0.06 | 72 | 0.2%
0.01 | 70 | 0.2%
Other values (3598) | 17851 | 60.4%
(Missing) | 8041 | 27.2%

Smallest values
Value | Count | Frequency (%)
0 | 2861 | 9.7%
0.01 | 70 | 0.2%
0.02 | 111 | 0.4%
0.03 | 102 | 0.3%
0.04 | 86 | 0.3%
0.05 | 99 | 0.3%
0.06 | 72 | 0.2%
0.07 | 61 | 0.2%
0.08 | 76 | 0.3%
0.09 | 54 | 0.2%

Largest values
Value | Count | Frequency (%)
454.85 | 1 | < 0.1%
454.12 | 1 | < 0.1%
449.14 | 1 | < 0.1%
448.87 | 1 | < 0.1%
445.84 | 1 | < 0.1%
443.63 | 1 | < 0.1%
437.77 | 1 | < 0.1%
435.94 | 1 | < 0.1%
434.92 | 1 | < 0.1%
433.02 | 1 | < 0.1%

Xylene
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 1561
Distinct (%): 13.7%
Missing: 18109
Missing (%): 61.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.070127823
Minimum: 0
Maximum: 170.37
Zeros: 1747
Zeros (%): 5.9%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.14
median: 0.98
Q3: 3.35
95-th percentile: 12.558
Maximum: 170.37
Range: 170.37
Interquartile range (IQR): 3.21

Descriptive statistics

Standard deviation: 6.323247407
Coefficient of variation (CV): 2.059603955
Kurtosis: 119.9801163
Mean: 3.070127823
Median Absolute Deviation (MAD): 0.98
Skewness: 7.891515254
Sum: 35067
Variance: 39.98345777
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 1747 | 5.9%
0.1 | 255 | 0.9%
2 | 142 | 0.5%
0.65 | 120 | 0.4%
0.12 | 108 | 0.4%
0.11 | 93 | 0.3%
0.15 | 80 | 0.3%
0.13 | 80 | 0.3%
0.16 | 77 | 0.3%
0.52 | 76 | 0.3%
Other values (1551) | 8644 | 29.3%
(Missing) | 18109 | 61.3%

Smallest values
Value | Count | Frequency (%)
0 | 1747 | 5.9%
0.01 | 68 | 0.2%
0.02 | 50 | 0.2%
0.03 | 52 | 0.2%
0.04 | 42 | 0.1%
0.05 | 52 | 0.2%
0.06 | 56 | 0.2%
0.07 | 72 | 0.2%
0.08 | 62 | 0.2%
0.09 | 62 | 0.2%

Largest values
Value | Count | Frequency (%)
170.37 | 1 | < 0.1%
137.45 | 1 | < 0.1%
125.18 | 1 | < 0.1%
116.62 | 1 | < 0.1%
109.23 | 1 | < 0.1%
105.76 | 1 | < 0.1%
94.48 | 1 | < 0.1%
89.7 | 1 | < 0.1%
84.72 | 1 | < 0.1%
81.26 | 1 | < 0.1%

AQI
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 829
Distinct (%): 3.3%
Missing: 4681
Missing (%): 15.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 166.4635815
Minimum: 13
Maximum: 2049
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 230.8 KiB

Quantile statistics

Minimum: 13
5-th percentile: 50
Q1: 81
median: 118
Q3: 208
95-th percentile: 407
Maximum: 2049
Range: 2036
Interquartile range (IQR): 127

Descriptive statistics

Standard deviation: 140.6965851
Coefficient of variation (CV): 0.8452094076
Kurtosis: 21.42372711
Mean: 166.4635815
Median Absolute Deviation (MAD): 48
Skewness: 3.396757198
Sum: 4136620
Variance: 19795.52906
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
102 | 223 | 0.8%
100 | 222 | 0.8%
70 | 208 | 0.7%
106 | 208 | 0.7%
78 | 198 | 0.7%
98 | 195 | 0.7%
66 | 192 | 0.7%
104 | 192 | 0.7%
80 | 190 | 0.6%
92 | 187 | 0.6%
Other values (819) | 22835 | 77.3%
(Missing) | 4681 | 15.9%

Smallest values
Value | Count | Frequency (%)
13 | 1 | < 0.1%
14 | 3 | < 0.1%
15 | 3 | < 0.1%
16 | 4 | < 0.1%
17 | 7 | < 0.1%
18 | 2 | < 0.1%
19 | 27 | 0.1%
20 | 29 | 0.1%
21 | 7 | < 0.1%
22 | 8 | < 0.1%

Largest values
Value | Count | Frequency (%)
2049 | 1 | < 0.1%
1917 | 1 | < 0.1%
1842 | 1 | < 0.1%
1747 | 1 | < 0.1%
1719 | 1 | < 0.1%
1672 | 1 | < 0.1%
1646 | 1 | < 0.1%
1630 | 1 | < 0.1%
1613 | 1 | < 0.1%
1595 | 1 | < 0.1%

AQI_Bucket
Categorical

HIGH CORRELATION
MISSING

Distinct: 6
Distinct (%): < 0.1%
Missing: 4681
Missing (%): 15.9%
Memory size: 230.8 KiB

Value | Count
Moderate | 8829
Satisfactory | 8224
Poor | 2781
Very Poor | 2337
Good | 1341

Length

Max length: 12
Median length: 8
Mean length: 8.646639839
Min length: 4

Characters and Unicode

Total characters: 214869
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Poor
2nd row: Very Poor
3rd row: Severe
4th row: Severe
5th row: Severe

Common Values

ValueCountFrequency (%)
Moderate | 8829 | 29.9%
Satisfactory | 8224 | 27.8%
Poor | 2781 | 9.4%
Very Poor | 2337 | 7.9%
Good | 1341 | 4.5%
Severe | 1338 | 4.5%
(Missing) | 4681 | 15.9%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
moderate | 8829 | 32.5%
satisfactory | 8224 | 30.2%
poor | 5118 | 18.8%
very | 2337 | 8.6%
good | 1341 | 4.9%
severe | 1338 | 4.9%
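
The word-level counts differ from the category counts because "Very Poor" contributes to both "very" and "poor". The sketch below reproduces the word table from the category counts. It assumes words are obtained by lowercasing and splitting on whitespace, which matches the figures but is not stated in the report.

```python
from collections import Counter

# Category counts copied from the "Common Values" table for AQI_Bucket.
bucket_counts = {
    "Moderate": 8829,
    "Satisfactory": 8224,
    "Poor": 2781,
    "Very Poor": 2337,
    "Good": 1341,
    "Severe": 1338,
}

word_counts = Counter()
for label, count in bucket_counts.items():
    # Assumption: words are lowercased and split on whitespace.
    for word in label.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word}: {count} ({100 * count / total_words:.1f}%)")
```

Under this assumption "poor" totals 2781 + 2337 = 5118, and the percentages are taken over 27187 words rather than over the 29531 rows. That is why they do not match the Common Values percentages.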

Most occurring characters

Value | Count | Frequency (%)
o | 29971 | 13.9%
r | 25846 | 12.0%
a | 25277 | 11.8%
t | 25277 | 11.8%
e | 24009 | 11.2%
y | 10561 | 4.9%
d | 10170 | 4.7%
S | 9562 | 4.5%
M | 8829 | 4.1%
c | 8224 | 3.8%
Other values (8) | 37143 | 17.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 185345 | 86.3%
Uppercase Letter | 27187 | 12.7%
Space Separator | 2337 | 1.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 29971 | 16.2%
r | 25846 | 13.9%
a | 25277 | 13.6%
t | 25277 | 13.6%
e | 24009 | 13.0%
y | 10561 | 5.7%
d | 10170 | 5.5%
c | 8224 | 4.4%
s | 8224 | 4.4%
f | 8224 | 4.4%
Other values (2) | 9562 | 5.2%
Uppercase Letter
Value | Count | Frequency (%)
S | 9562 | 35.2%
M | 8829 | 32.5%
P | 5118 | 18.8%
V | 2337 | 8.6%
G | 1341 | 4.9%
Space Separator
Value | Count | Frequency (%)
(space) | 2337 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 212532 | 98.9%
Common | 2337 | 1.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 29971 | 14.1%
r | 25846 | 12.2%
a | 25277 | 11.9%
t | 25277 | 11.9%
e | 24009 | 11.3%
y | 10561 | 5.0%
d | 10170 | 4.8%
S | 9562 | 4.5%
M | 8829 | 4.2%
c | 8224 | 3.9%
Other values (7) | 34806 | 16.4%
Common
Value | Count | Frequency (%)
(space) | 2337 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 214869 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 29971 | 13.9%
r | 25846 | 12.0%
a | 25277 | 11.8%
t | 25277 | 11.8%
e | 24009 | 11.2%
y | 10561 | 4.9%
d | 10170 | 4.7%
S | 9562 | 4.5%
M | 8829 | 4.1%
c | 8224 | 3.8%
Other values (8) | 37143 | 17.3%

Interactions

[Figure: 169 pairwise interaction plots of the numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
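The rank-then-correlate recipe above can be sketched in plain Python. This is an illustrative sketch, not the code the report generator runs; the helper names `average_ranks`, `pearson` and `spearman_rho` are hypothetical, and ties are assumed to receive the mean of the positions they occupy.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data (not from the report): a monotonic but nonlinear relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)   # ranks agree perfectly
r = pearson(x, y)          # linear fit is imperfect
```

On this toy input the ranks of `x` and `y` coincide, so ρ reaches its maximum while Pearson's r stays below 1, which is the difference the paragraph describes.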

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of its slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
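A minimal sketch of this calculation, assuming nothing beyond the definition above. The function name `pearson_r` is hypothetical; the two input lists are the NO and NOx values from the first five sample rows of this report. The rescaled copies illustrate the invariance under location and scale changes.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# NO and NOx from sample rows 0-4 (Ahmedabad, 2015-01-01 to 2015-01-05).
no = [0.92, 0.97, 17.40, 1.70, 22.10]
nox = [17.15, 16.46, 29.70, 17.97, 37.76]

r_original = pearson_r(no, nox)

# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([2 * v + 10 for v in no], [0.5 * v - 3 for v in nox])

# Negating one variable flips the sign of r but not its magnitude.
r_flipped = pearson_r(no, [-v for v in nox])
```

Five rows are far too few to say anything about the full dataset; the point is only that `r_original` and `r_rescaled` agree and `r_flipped` is their negative.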

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
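The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition; the function name `kendall_tau_a` and the toy data are hypothetical, and tie-adjusted variants such as tau-b, which statistical libraries often use, can give different values when ties are present.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither total
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

# Toy data: six pairs in total, one of them (2nd vs 3rd observation) discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])            # (5 - 1) / 6
tau_reversed = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])   # every pair discordant
```

The quadratic loop over all pairs is fine for illustration but would be slow on all 29531 observations.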

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
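As a sketch of what a bias-corrected estimate can look like, the function below applies the correction commonly attributed to Bergsma (2013) to a contingency table: the chi-squared-based φ² and the table dimensions are each shrunk by a term of order 1/(n-1). The function name `cramers_v_corrected` and both tables are hypothetical, and this is not claimed to be the exact code behind the report.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a table of counts (list of rows).

    Assumes every row and every column has a positive total.
    """
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Hypothetical 2x2 tables: proportional rows (independent) vs diagonal (perfect).
v_independent = cramers_v_corrected([[10, 20], [20, 40]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
```

The `max(0.0, ...)` clamp is what keeps the corrected value at zero when the observed association is no larger than sampling noise would produce.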

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
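
The per-column nullity counts and the nullity correlation described above can be approximated in a few lines. This is a sketch of the idea only, not the plotting library's implementation: the function names are hypothetical, missing values are represented as `None`, and the four rows are sample rows 0, 9, 29529 and 29530 of this report restricted to four columns.

```python
from math import sqrt

rows = [
    {"PM2.5": None,  "NO": 0.92, "Benzene": 0.00, "AQI": None},   # row 0
    {"PM2.5": None,  "NO": None, "Benzene": 0.00, "AQI": None},   # row 9
    {"PM2.5": 16.64, "NO": 4.05, "Benzene": 0.00, "AQI": 54.0},   # row 29529
    {"PM2.5": 15.00, "NO": 0.40, "Benzene": None, "AQI": 50.0},   # row 29530
]

def missing_counts(rows):
    """Number of missing cells per column (nullity by column)."""
    return {c: sum(r[c] is None for r in rows) for c in rows[0]}

def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because the correlation is undefined for a constant indicator.
    """
    a = [1.0 if r[col_a] is None else 0.0 for r in rows]
    b = [1.0 if r[col_b] is None else 0.0 for r in rows]
    n = len(rows)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    if va == 0 or vb == 0:
        return None
    return cov / sqrt(va * vb)

counts = missing_counts(rows)
pm_vs_aqi = nullity_correlation(rows, "PM2.5", "AQI")
pm_vs_benzene = nullity_correlation(rows, "PM2.5", "Benzene")
```

In these four rows PM2.5 and AQI are missing together, so their nullity correlation is at its maximum, while Benzene is missing only where PM2.5 is present, giving a negative value.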

Sample

First rows

   City       Date        PM2.5  PM10  NO      NO2    NOx     NH3  CO      SO2    O3      Benzene  Toluene  Xylene  AQI  AQI_Bucket
0  Ahmedabad  2015-01-01  NaN    NaN   0.92    18.22  17.15   NaN  0.92    27.64  133.36  0.00     0.02     0.00    NaN  NaN
1  Ahmedabad  2015-01-02  NaN    NaN   0.97    15.69  16.46   NaN  0.97    24.55  34.06   3.68     5.50     3.77    NaN  NaN
2  Ahmedabad  2015-01-03  NaN    NaN   17.40   19.30  29.70   NaN  17.40   29.07  30.70   6.80     16.40    2.25    NaN  NaN
3  Ahmedabad  2015-01-04  NaN    NaN   1.70    18.48  17.97   NaN  1.70    18.59  36.08   4.43     10.14    1.00    NaN  NaN
4  Ahmedabad  2015-01-05  NaN    NaN   22.10   21.42  37.76   NaN  22.10   39.33  39.31   7.01     18.89    2.78    NaN  NaN
5  Ahmedabad  2015-01-06  NaN    NaN   45.41   38.48  81.50   NaN  45.41   45.76  46.51   5.42     10.83    1.93    NaN  NaN
6  Ahmedabad  2015-01-07  NaN    NaN   112.16  40.62  130.77  NaN  112.16  32.28  33.47   0.00     0.00     0.00    NaN  NaN
7  Ahmedabad  2015-01-08  NaN    NaN   80.87   36.74  96.75   NaN  80.87   38.54  31.89   0.00     0.00     0.00    NaN  NaN
8  Ahmedabad  2015-01-09  NaN    NaN   29.16   31.00  48.00   NaN  29.16   58.68  25.75   0.00     0.00     0.00    NaN  NaN
9  Ahmedabad  2015-01-10  NaN    NaN   NaN     7.04   0.00    NaN  NaN     8.29   4.55    0.00     0.00     0.00    NaN  NaN

Last rows

       City           Date        PM2.5  PM10    NO    NO2    NOx    NH3    CO    SO2    O3     Benzene  Toluene  Xylene  AQI    AQI_Bucket
29521  Visakhapatnam  2020-06-22  33.17  108.22  5.58  42.45  27.06  13.70  0.73  13.65  34.85  3.99     10.24    2.32    95.0   Satisfactory
29522  Visakhapatnam  2020-06-23  25.40  83.38   2.76  34.09  19.92  13.13  0.54  10.40  43.27  2.88     12.03    1.33    100.0  Satisfactory
29523  Visakhapatnam  2020-06-24  34.36  90.90   1.22  23.38  13.12  14.45  0.56  10.92  35.12  2.99     3.15     1.60    86.0   Satisfactory
29524  Visakhapatnam  2020-06-25  13.45  58.54   2.30  21.60  13.09  12.27  0.41  8.19   29.38  1.28     5.64     0.92    77.0   Satisfactory
29525  Visakhapatnam  2020-06-26  7.63   32.27   5.91  23.27  17.19  11.15  0.46  6.87   19.90  1.45     5.37     1.45    47.0   Good
29526  Visakhapatnam  2020-06-27  15.02  50.94   7.68  25.06  19.54  12.47  0.47  8.55   23.30  2.24     12.07    0.73    41.0   Good
29527  Visakhapatnam  2020-06-28  24.38  74.09   3.42  26.06  16.53  11.99  0.52  12.72  30.14  0.74     2.21     0.38    70.0   Satisfactory
29528  Visakhapatnam  2020-06-29  22.91  65.73   3.45  29.53  18.33  10.71  0.48  8.42   30.96  0.01     0.01     0.00    68.0   Satisfactory
29529  Visakhapatnam  2020-06-30  16.64  49.97   4.05  29.26  18.80  10.03  0.52  9.84   28.30  0.00     0.00     0.00    54.0   Satisfactory
29530  Visakhapatnam  2020-07-01  15.00  66.00   0.40  26.85  14.05  5.20   0.59  2.10   17.05  NaN      NaN      NaN     50.0   Good